Phonotactic Language Identification for Singing

Author

Anna M. Kruspe

Abstract

Over the past decades, many successful approaches for language identification have been published. However, almost none of these approaches were developed with singing in mind. Singing has many characteristics that differ from speech, such as a wider variance of fundamental frequencies and phoneme durations, vibrato, pronunciation differences, and different semantic content. We present a new phonotactic language identification system for singing based on phoneme posteriorgrams. These posteriorgrams were extracted using acoustic models trained on English speech (TIMIT) and on an unannotated English-language a cappella singing dataset (DAMP). SVM models were then trained on phoneme statistics. The models are evaluated on a set of amateur singing recordings from YouTube and, for comparison, on the OGI Multilanguage corpus. While the results on a cappella singing are somewhat worse than those previously obtained using i-vector extraction, this approach is easier to implement. Phoneme posteriorgrams need to be extracted for many applications anyway, and with this approach they can easily be reused for language identification. The results on singing improve significantly when the acoustic models have also been trained on singing. Interestingly, the best results on the OGI speech corpus are also obtained with acoustic models trained on singing.
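The abstract does not say which phoneme statistics are computed from the posteriorgrams, so the following is only a minimal sketch under stated assumptions: a posteriorgram is taken to be a frames-by-phonemes matrix of probabilities, and the statistics are assumed to be the mean posterior per phoneme plus "soft" bigram counts between consecutive frames. The function name `phoneme_statistics` and the choice of features are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def phoneme_statistics(posteriorgram: Sequence[Sequence[float]]) -> List[float]:
    """Summarize a (frames x phonemes) posteriorgram as a fixed-length vector.

    Assumed features (not specified in the abstract):
      * mean posterior of each phoneme over all frames
      * soft bigram statistics: average of p_t(a) * p_{t+1}(b) over
        consecutive frame pairs, for every phoneme pair (a, b)
    """
    n_frames = len(posteriorgram)
    if n_frames == 0:
        raise ValueError("posteriorgram must contain at least one frame")
    n_phonemes = len(posteriorgram[0])

    # Unigram statistics: average probability mass assigned to each phoneme.
    means = [
        sum(frame[p] for frame in posteriorgram) / n_frames
        for p in range(n_phonemes)
    ]

    # Soft bigram statistics: expected phoneme transitions between
    # neighbouring frames, flattened row-major as index a * n_phonemes + b.
    bigrams = [0.0] * (n_phonemes * n_phonemes)
    for prev, cur in zip(posteriorgram, posteriorgram[1:]):
        for a in range(n_phonemes):
            for b in range(n_phonemes):
                bigrams[a * n_phonemes + b] += prev[a] * cur[b]
    n_pairs = max(n_frames - 1, 1)  # avoid division by zero for one frame
    bigrams = [value / n_pairs for value in bigrams]

    return means + bigrams


if __name__ == "__main__":
    # Toy posteriorgram: 3 frames, 2 phonemes.
    toy = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    print(phoneme_statistics(toy))
```

Each recording then yields one vector of fixed length, independent of its duration, which is the kind of input an SVM classifier (for example scikit-learn's `SVC`) expects, with the language as the class label.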


Similar resources

Comparing different model configurations for language identification using a phonotactic approach

In this paper different model configurations for language identification using a phonotactic approach are explored. Identification experiments were carried out on the 11-language telephone speech corpus OGI-TS, containing calls in French, English, German, Spanish, Japanese, Korean, Mandarin, Tamil, Farsi, Hindi, and Vietnamese. Phone sequences output by one or multiple phone recognizers are res...


An efficient phonotactic-acoustic system for language identification

This paper presents a combined two-component system for language identification based on phonotactic and acoustic features. The phonotactic part consisting of a multilingual phone-recognizer with a double bigram-decoding architecture and a phonetic-context mapping is supported by a second part with pronunciation modeling of the recognized phone-sequence using Gaussian density models. Both parts ...


Fusion of contrastive acoustic models for parallel phonotactic spoken language identification

This paper investigates combining contrastive acoustic models for parallel phonotactic language identification systems. PRLM, a typical phonotactic system, uses a phone recogniser to extract phonotactic information from the speech data. Combining multiple PRLM systems together forms a Parallel PRLM (PPRLM) system. A standard PPRLM system utilises multiple phone recognisers trained on different ...


Automatic language identification using a segment-based approach

Automatic Language Identification (ALI) is the problem of automatically identifying the language of an utterance through the use of a computer. In 1977, House and Neuburg proposed an approach to ALI which focused on the phonotactic constraints of different languages. Their work suggested that simple language models could be used effectively for language identification if an accurate phonetic re...


Phonotactic spoken language identification with limited training data

We investigate the addition of a new language, for which limited resources are available, to a phonotactic language identification system. Two classes of approaches are studied: in the first class, only existing phonetic recognizers are employed, whereas an additional phonetic recognizer in the new language is created for the second class. It is found that the number of acoustic recognizers emp...




Publication date: 2016
